Translation Memory Engines: A Look under the Hood and Road Test

Author

  • Timothy Baldwin
Abstract

In this paper, we compare the relative effects of segment order, segmentation and segment contiguity on the retrieval performance of a translation memory system. We take a selection of both bag-of-words and segment order-sensitive string comparison methods, and run each over both character- and word-segmented data, in combination with a range of local segment contiguity models (in the form of N-grams). Over two distinct datasets, we find that indexing according to simple character bigrams produces a retrieval accuracy superior to any of the tested word N-gram models. Further, in their optimal configuration, bag-of-words methods are shown to be equivalent to segment order-sensitive methods in terms of retrieval accuracy, but much faster. We also provide evidence that our findings scale over larger-sized translation memories.
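The contrast drawn in the abstract between bag-of-words and segment order-sensitive comparison can be illustrated with a minimal sketch. It is not the paper's implementation: the Dice coefficient over character bigrams and plain Levenshtein distance are stand-ins chosen for illustration, and the function names and the three-entry toy memory are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n=2):
    """Return the multiset of character n-grams in text (bigrams by default)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def dice(a, b):
    """Dice coefficient between two n-gram multisets; ignores segment order."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    overlap = sum((a & b).values())  # multiset intersection
    return 2.0 * overlap / total


def edit_distance(s, t):
    """Character-level Levenshtein distance; sensitive to segment order."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (cs != ct)))  # substitution
        prev = curr
    return prev[-1]


def retrieve_bigram(query, memory):
    """Bag-of-bigrams retrieval: return the entry with the highest Dice score."""
    q = char_ngrams(query)
    return max(memory, key=lambda seg: dice(q, char_ngrams(seg)))


def retrieve_edit(query, memory):
    """Order-sensitive retrieval: return the entry with the lowest edit distance."""
    return min(memory, key=lambda seg: edit_distance(query, seg))


# Hypothetical toy translation memory (source-language side only).
memory = [
    "press the red button to stop the machine",
    "open the cover before cleaning the filter",
    "the machine stops when the cover is open",
]
query = "press the green button to start the machine"

print(retrieve_bigram(query, memory))
print(retrieve_edit(query, memory))
```

On this toy input both retrieval functions select the same entry. The bigram method only counts overlapping bigrams, whereas the edit-distance method fills a table whose size grows with the product of the two string lengths, which gives some intuition for why an order-insensitive method can be faster.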

Similar resources

Comparison of the Updating Function of Working Memory in Three Groups: Substance Abusers (Heroin, Opium), Those under Methadone Treatment, and Normal Controls

Introduction: Chronic use of opiates is associated with a wide range of neuropsychological deficits. Therefore, this study aimed to evaluate one of the neuropsychological functions, updating function of working memory, in three groups, including substance abusers (heroin and opium), those under treatment with methadone, and normal controls. Methods: The method of this study was causal-compar...

L2 Learners’ Strategy Preference in Metaphorical Test Performance: Effects of Working Memory and Cognitive Style

Although investigating the factors that influence test scores is important, a majority of stakeholders show a paucity of attention towards individual learner differences due to having large classes of L2 learners. This study sought to explore the possible effect of working memory and cognitive style on L2 learners’ metaphorical test performance. The study was conducted in 2 phases. The first ph...

The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...

Road Departure Avoidance System Based on the Driver Decision Estimator

In this paper a robust road departure avoidance system based on a closed-loop driver decision estimator (DDE) is presented. The main idea is that of incorporating the driver intent in the control of the vehicle. The driver decision estimator computes the vehicle look ahead lateral position based on the driver input and uses this position to establish the risk of road departure. To induce a risk...

An Investigation of Cognitive Processes of Interpretation from Persian to English

This study examined the cognitive processes in interpretation through employing Think-aloud Protocols (TAPs) among Iranian translators. The participants included 10 professional and nonprofessional translators selected through Nelson Proficiency Test. TAP and retrospective interview were used as the major instruments in order to collect the data from self-reports protocols. In order to assess t...

Publication date: 2005